IKEA: Data of IKEA Furniture from Kaggle


Description
This is my first #TidyTuesday plot! It is a fairly basic 4-panel bar chart (my favorite type of chart) that leans into my preference for compound plots that pack in a lot of information. The bulk of the work is some basic regex on the designer names, since a single string can list more than one designer.
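To illustrate the designer-name step, here is a minimal sketch of the idea, written in Python rather than the R used for the actual plot. The `/` separator, the `split_designers` helper, and the sample rows are all assumptions for illustration and are not taken from the Kaggle data.

```python
import re
from collections import Counter


def split_designers(raw, sep="/"):
    """Split a multi-designer string into individual, trimmed names.

    The separator is an assumption; swap in whatever the data really uses.
    """
    # Allow optional whitespace around the separator.
    pattern = r"\s*" + re.escape(sep) + r"\s*"
    parts = re.split(pattern, raw.strip())
    # Drop empty pieces left by leading or trailing separators.
    return [p for p in parts if p]


# Hypothetical rows, one string per furniture item.
rows = ["Designer A/Designer B", "Designer B", " Designer C / Designer A "]

# Count how many items each individual designer is credited on.
counts = Counter(name for row in rows for name in split_designers(row))
```

Once each name sits on its own, the per-designer counts can feed straight into a bar chart.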

Code and tweet
Github
Tweet

Mobile Phones: Historical Phone Usage from OurWorldInData.org


Description
This submission is a bit more complicated in that it is an animation made with gganimate. The plot itself is still pretty basic (still a bar chart). It displays the relative proportions of mobile versus landline phones over time across different continents. Colors are from ggsci.

Code and tweet
Github
Tweet

Hiking: Trail information from Washington Trails Association


Description
This is another compound graph with multiple panels using facet_grid. This plot explores mosaic plots using ggmosaic, which can show the relationship between two categorical variables with more than 2 categories. Box size indicates the frequency of a certain combination of categories (for example, the frequency of wildlife as a feature among highly rated trails). Using these plots, you can see the overrepresentation of some trail features among highly rated versus lowly rated trails. For example, in short trails with tall high points, the presence of established campsites usually indicates a higher rating. However, the same does not hold for longer trails of similar heights. Colors are from ggsci.

Code and tweet
Github
Tweet

Shelters: Toronto Shelters from the opendatatoronto R package


Description
One thing I learned from this visualization is that maps are so difficult! In this visual I used ggmap to geocode the address text and plot the results over a basic Google street map. I initially tried to use some of the nicer black and white maps, but for some reason they never worked. This plot then turned into an animation showing the size of each shelter (as point size) and whether it is full (using color) across time. The animation was done using gganimate. There were some issues with disappearing shelters, possibly due to missing data, and the plot doesn’t look too great. However, maps are hard!

Code and tweet
Github
Tweet

BBC Women of 2020


Description
This is a basic 4-panel bar chart once again, but the twist this time is that the bars come from description text processed with the tidytext package, following its associated book! Being able to see the relative frequency of terms (using tf-idf) describing the amazing individuals on the list, stratified by category, is surprisingly cool and informative!
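For readers unfamiliar with tf-idf, here is a small worked sketch in Python (not the R/tidytext code behind the plot). It assumes the common definition: term frequency within a category multiplied by the natural log of the number of categories divided by the number of categories containing the term. The `tf_idf` helper, the category names, and the tokens are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs maps a category name to its list of tokens.

    Returns a dict mapping each category to {term: tf-idf score}.
    """
    n_docs = len(docs)
    # Document frequency: in how many categories does each term appear?
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))

    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[name] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


# Hypothetical tokens, already lowercased and stripped of stop words.
docs = {
    "category_a": ["activist", "leader", "rights", "leader"],
    "category_b": ["scientist", "researcher", "rights"],
}
scores = tf_idf(docs)
```

A term that appears in every category, like "rights" here, scores zero, which is why tf-idf surfaces the words that are distinctive to each panel rather than the ones that are merely frequent.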

Code and tweet
Github
Tweet

American Ninja: Data about obstacles and location by round and season from the American Ninja Warrior show


Description
This is one of my favorite figures to generate! This is an attempt to make a more visually appealing figure using the circlize package. Even though this package uses base R graphics instead of the tidyverse, the result looks really neat. Additionally, I no longer rely on the ggsci package for colors and instead use an approximation of the show's palette, which gives the figure more visual cohesion with its theme.

Code and tweet
Github
Tweet

Big Mac Index: Latest analysis on the famed Big Mac index from The Economist


Description
I was surprised by how nice this graph turned out and how closely it replicates the look of The Economist. The graph is a basic set of boxplots showing the distribution of the percentage of wealth held by the top 10% (as a proxy for income inequality) across years and categories of currency valuation using the Big Mac index (both GDP-adjusted and raw). This plot shows that with the raw index, it might seem that countries with higher income inequality have more undervalued currencies, but after adjusting for GDP this relationship reverses. The most difficult part of this graphic was drawing in external data and matching it up using country codes. I ended up using the wbstats package to pull in extra income information from the World Bank.

Code and tweet
Github
Tweet

Plastics: Plastic pollution from the Break Free from Plastic Organization


Description
When I write academic papers, the goal is to cram as much information as possible into one graph, which has given me the habit of producing complicated multi-panel figures with variations in color, linetype, shape, and so on. This graph is an attempt to generate something simple (circular again, using coord_polar, inspired by [@ijeamaka_a](https://twitter.com/ijeamaka_a)). This visual focuses on the three largest polluters and where they are polluting. The result, with commentary text, is a bit more punchy and easier to understand! Colors are still from ggsci.

Code and tweet
Github
Tweet

Post Office: US post office data from 1639-2000


Description
This graph let me play around with additional geoms in the ggplot ecosystem, this time geom_stream from the ggstream package. The result is a really nice plot where the overall number is shown alongside a breakdown by category (regions in this case) across time! There is a bit more custom theming in this case (I changed 15 options in theme), and I think I now understand better how to mold ggplot to do exactly what I want it to do. This includes changing the background, text contrast, and axes. Colors this time are from ggpomological. When calculating the number of post offices active at each point in time, I treated an NA end date as meaning the post office is still in operation. I also ignored the fact that some post offices did not operate continuously throughout the years they were open.
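The counting rule described above can be sketched as follows, in Python rather than the R used for the plot. The `active_counts` helper and the sample records are hypothetical, and counting an office as active in its final year is an assumption on my part.

```python
def active_counts(offices, years):
    """Count post offices active in each requested year.

    offices: list of (established, discontinued) year pairs, where
    discontinued is None when the end date is missing (NA).
    """
    counts = {}
    for year in years:
        counts[year] = sum(
            1
            for start, end in offices
            # A missing end date is treated as "still in operation".
            # Gaps in operation between start and end are ignored.
            if start <= year and (end is None or year <= end)
        )
    return counts


# Hypothetical records, not taken from the post office data.
offices = [(1800, 1850), (1820, None), (1900, 1950)]
counts = active_counts(offices, [1810, 1830, 1925, 2000])
```

Here the second office, with no recorded end date, is still counted in 2000, which mirrors the NA handling described above.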

Code and tweet
Github
Tweet